Emotional FESTIVAL-MBROLA TTS synthesis
نویسندگان
چکیده
The topic of this work is an extension of our previous research on the development of a general data-driven procedure for creating a neutral “narrative-style” prosodic module for the Italian FESTIVAL Text-To-Speech (TTS) synthesizer, and it is focused on investigating and implementing new strategies for building a new emotional FESTIVAL TTS. The new emotional prosodic modules, similarly to the neutral case, are still based on the “Classification And Regression Tree” (CART) theory. The extension to the emotional speech synthesis is obtained using a differential approach: the emotional prosodic modules learn the differences between the neutral (without emotions) and the emotional prosodic data. Moreover, due to the fact that Voice Quality (VQ) is known to play an important role in emotive speech, a rule-based FESTIVAL-MBROLA VQmodification module, for control of temporal and spectral characteristics of the synthesis, has also been implemented. Even if emotional synthesis still remains an attractive open issue, our preliminary evaluation results underline the effectiveness of the proposed solution.
منابع مشابه
On the First Greek-TTS Based on Festival Speech Synthesis
In this article we describe the first Text To Speech (TTS) system for the Greek language based on Festival architecture. We discuss practical implementation details and we capitalize on the preparation of the diphone database and on the prediction of phoneme duration module implemented with CART tree technique. Two male databases where used for two different speech synthesis engines, namely, re...
متن کاملA text-to-speech synthesis system for telugu
In this paper, a diphone based Text-to-Speech (TTS) system for the Telugu language is presented. Telugu is one of the main south-Indian languages spoken by more than 100 million people. Speech output is generated using the Festival Speech Synthesis System and the MBROLA synthesis engine. The design and collection of diphones and voice building process are described. Our text analysis module, th...
متن کاملSMS-FESTIVAL: a new TTS framework
A new sinusoidal model based engine for FESTIVAL TTS system which performs the DSP (Digital Signal Processing) operations (i.e. converting a phonetic input into audio signal) of a diphone-based TTS concatenative system, taking as input the NLP (Natural Language Processing) data (a sequence of phonemes with length and intonation values elaborated from the text script) computed by FESTIVAL is des...
متن کاملFestival speaks Italian!
Finally Festival speaks Italian. In this work, the development of the first Italian version of the Festival TTS system is described. One male and one female voice for three different speech engines are considered: the Festival-specific residual LPC synthesizer, the OGI residual LPC Plug-In for Festival and the MBROLA synthesizer. The new Italian voices will be freely available for download for ...
متن کاملProsodic data driven modelling of a narrative style in Festival TTS
A general data-driven procedure for creating new prosodic modules for the Italian FESTIVAL Text-To-Speech (TTS) [1] synthesizer is described. These modules are based on the “Classification and Regression Trees” (CART) theory. The prosodic factors taken into consideration are: duration, pitch and loudness. Loudness control has been implemented as an extension to the MBROLA diphone concatenative ...
متن کامل